DOCUMENT RESUME

ED 092 175          95          IR 000 742

AUTHOR        Pinsky, Paul D.
TITLE         Achievement Monitoring of Individually Paced
              Instruction. Final Report.
INSTITUTION   Sequoia Union High School District, Redwood City,
              Calif.
SPONS AGENCY  Office of Education (DHEW), Washington, D.C.
PUB DATE      Oct 73
CONTRACT      OEC-9-72-0012
NOTE          31p.; Study conducted at San Carlos High School
EDRS PRICE    MF-$0.75 HC-$1.85 PLUS POSTAGE
DESCRIPTORS   Achievement Tests; *Computer Assisted Instruction;
              Criterion Referenced Tests; Earth Science; Field
              Studies; *Formative Evaluation; High School Students;
              *Individualized Instruction; *Secondary Grades;
              Summative Evaluation; *Test Construction
IDENTIFIERS   *Comprehensive Achievement Monitoring

ABSTRACT

A study was made to monitor achievement of individually paced
instruction. The project concentrated on designing testing procedures
in group paced instructional programs to provide information to
students, teachers, parents and administrators which could be used in
both a formative and summative evaluation. The three objectives of the
project were: (1) to adapt the Comprehensive Achievement Monitoring
(CAM) design for an individually paced program of instruction that
contains a series of units through which students progress in
sequence; (2) explore the applicability of computer-assisted
instruction evaluation technique to criterion referenced testing (CRT)
for individually paced instruction; and (3) to field test the adopted
CAM design in a high school earth science course. The results showed
quite strongly that the students whose learning activities were
controlled the most showed the greatest gains in achievement levels.
Gains were measured by the CAM tests and by standardized tests given at
the beginning and end of the course. The results show this population
of students are not able to work independently with CRT data and direct
their own study activities. This finding confirms less formal studies
completed in previous years. (WCM)




FINAL REPORT 

ACHIEVEMENT MONITORING
OF 

INDIVIDUALLY PACED INSTRUCTION 
OEC-9-72-0012 



Paul D. Pinsky 
October, 1973 






CONTENTS 

1. Introduction

2. Evaluation Designs for Individually Paced Instruction

3. The Application of Computer-Assisted Instruction
   to Criterion-Referenced Testing

4. Earth Science Criterion-Referenced Testing Results

References



1. Introduction



The report discusses achievement monitoring of individually paced
instruction. During the past decade many educators have advocated
changing the educational system to provide instruction that is designed
to better serve the individual needs of students (1,2). A common
practice used in many schools today is the individual pacing of students
through a fixed sequence of content material (3). This technique of
instruction, however, presents certain problems concerning the
evaluation and control of the learning process. The most common
mechanism for controlling the instructional activities of students is a
pretest-posttest design. In this design the student takes a
criterion-referenced test (CRT) on the unit of material before
instruction has occurred. Based upon the results of the pretest, he
directs his study activities. When the student feels that he has learned
the material in the unit, he takes a parallel posttest. If he masters
the posttest, he goes on to the pretest for the next unit; if he does
poorly on the posttest, he continues to review the objectives in the
unit.

The pretest-posttest control mechanism, however, does have certain
disadvantages. For example, the amount of clerical work necessary to
run such a program is substantial. Most programs using the above
testing procedures have several forms of each unit posttest, and the
instructor must provide the correct form for each student. Further, the
pretests and posttests are usually processed by members of the
instructional staff. Several projects throughout the country are using
computer technology to help alleviate this clerical problem (4,5). The
initial findings, however, indicate that even with a computer system,
the cost of operating such a testing program is high.

The recently completed Comprehensive Achievement Monitoring (CAM) 
project explored the use of sampling techniques to help provide the 
types of information described above (6). The project concentrated on 
designing testing procedures in group paced instructional programs to 
provide information to students, teachers, parents and administrators 
that could be used in both a formative and summative fashion. The CAM 
system is currently being used in over 100 school districts throughout 
the country. 

The three objectives of the project are to: 

a.) Adapt the CAM design for an individually paced program of
instruction that contains a series of units through which students
progress in sequence. The adaptation should provide more information for
decision-making about individuals and at the same time retain some of
the information about groups of students.

b.) Explore the applicability of computer-assisted instruction
(CAI) evaluation technique to criterion referenced testing (CRT)
for individually paced instruction.

c.) Field test the adapted CAM design in an earth science
course at San Carlos High School, San Carlos, California. A major
sub-objective is to determine the extent to which students can make
their own decisions regarding instructional activities in an
individually paced program without the usual pretest-posttest design.






The adaptation of the CAM design for individually paced instruction
has been completed and is presented in Section 2. The evaluation
designs are based upon the decisions that are to be supported when the
data is generated. Educational decisions have been divided into three
categories or levels.

Level I:   Decisions about individual students based on diagnostic
           test data, on a pretest, posttest, or retention basis.

Level II:  Decisions about instruction on a course, program,
           class, or curriculum basis.

Level III: Decisions about schools, districts, regions, and the
           entire state.

Three evaluation models, the unit CAM, the sliding unit CAM, and the 
standard CAM are presented. Terminology is defined that enables one to 
characterize a variety of individually paced evaluation designs in terms 
of the basic models. The three models are compared relative to the type 
of information they generate for Level I and Level II decision making.

The application of CAI evaluation techniques to CRT situations is
discussed. From a practical viewpoint, the analysis is not very
encouraging. The major advantage of the CAI concept is that a history of
student performance is stored in the computer. This performance history
is used to determine the test items to present to each student. Thus,
testing is done dynamically to account for individual student
differences and growth over time. The individualized CAM designs are
static in that they treat all students who are working on the same unit
in the same manner. That is, the CAM technique does not account for
differences in student learning patterns within the same unit of
material.

There are several problems in applying CAI techniques to CRT. The
first is that the computer would have to print a customized test for
each student at each test administration. Problems arise in entering
the test items into the computer; in printing special symbols,
characters, and diagrams on existing computer output devices; and in the
cost of generating such tests. Moreover, the cost of data entry into the
computer, and the administrative costs of keeping track of the large
numbers of documents going back and forth to the computer would probably
be prohibitive.

The most promising approach to applying CAI evaluation techniques 
to the CRT situation might be to do on-line CRT. The cost of computer 
power and the cost of terminal devices (such as cathode-ray tubes - CRT) 
drops every year. Therefore, an area for future research would be to 
devise an on-line CRT system that accounts for individual student
differences.

The field testing of the modified CAM design produced some
interesting results. Because of small sample sizes and circumstances
perhaps unique to the San Carlos student body, the results must be
considered tentative and should be subjected to additional studies in
different environments. The study contained 3 classes of approximately
28 students each. One class was free to choose their own study materials
and to decide when they had learned these materials. Another group was
directed in their study efforts, but still had some decisions concerning
when they had learned the material. The third group worked closely with
the instructional staff concerning what they were to study and when they
had learned the material.

The results show quite strongly that the students whose learning
activities were controlled the most showed the highest gains in achievement
levels. These gains were measured by the CAM tests and by standardized
tests given at the beginning and end of the course. The students in the
course tend to be underachievers in the science area. Many took the earth
science course to fulfill the high school graduation requirement in science.
The results show quite clearly that this population of students is not able
to work independently with CRT data and direct their own study activities.
This result confirms less formal studies that have been completed in the
same course in previous years.

2. Evaluation Designs for Individually Paced Instruction

Decision Levels

This part of the report discusses different evaluation designs for 
the use of criterion-referenced testing (CRT) data. These evaluation 
designs reflect the variety of educational decisions that are made using
the results of CRT data. The type of evaluation design used should 
depend upon the decisions that will be made when the results are gener- 
ated. 

One must recognize that there is a wide variety of decision makers 
in an educational enterprise. A few examples are teachers as a faculty, 
teachers as individuals, students, parents, principals, school
committees, etc. Each of these people or groups makes decisions about
the same educational enterprise. Each, however, makes different kinds of
decisions about the enterprise, from different perspectives. Because of 
this fact, each needs different kinds of data. 

For purposes of this report, educational decisions are divided into 
three categories or levels:

Level I:   Decisions about individual students, based on diagnostic
           test data, on a pretest, posttest, or retention basis.

Level II:  Decisions about instruction on a course, program,
           class, or curriculum basis.

Level III: Decisions about schools, districts, regions, and
           the entire state.

Each level of decision making needs different kinds of data because each 
involves different kinds of decisions by different levels of decision 
makers. This can be better seen by examining some possible decisions on 
each level. 




Examples of Level I Decisions 

(1) Have the students in the class mastered the prescribed subject 
matter? 

(2) What objectives of instruction did the students know prior to
instruction?

(3) What learning did the students retain after instruction?

(4) What students need additional work on what objectives?

(5) What students do not need to go through a particular learning
sequence since they can already perform the skill to be taught?

These Level I decisions need data about how individual students do in
relation to specific learning objectives (prespecified criteria).

Examples of Level II Decisions

(1) What objectives should be added to the curriculum? (or de- 
leted? or modified?) 

(2) Which objectives are none of the students meeting? Why? 

(3) What instructional materials and programs work better in terms
of student outcomes at each stage? For what students?

(4) Which instructional modes work in having students achieve
which objectives?

(5) Which class(es) are succeeding (or failing) with respect to 
the objectives in a certain course or curriculum? 

These might be some typical Level II decisions. It can be seen immedi- 
ately that this is a larger level of decision making in that the focus 
is no longer on the individual student, but rather on groups of students. 
Each of the above decisions needs group data rather than individual 
student data. This does not mean that the instruction must be
"group-paced" (indeed, the report is focusing on individually-paced
instruction),
but that the results must be summarized over the individual students to 
facilitate the Level II decision. 

Examples of Level III Decisions 

(1) To what degree are the pupils of the state attaining the goals 
toward which public education is directed? 

(2) To what degree are the pupils of each district attaining the 
goals toward which public education is directed? 

(3) Which districts are attaining unusual success and what factors 
appear to be responsible for that success? 

(4) When new educational programs are introduced into the schools,
do subsequent changes in pupil accomplishments indicate that the 
program is accomplishing its purposes? 

These kinds of decisions deal with large numbers of individuals, so
large that the collection of individual student data could swamp a
decision maker. In fact, this has been one of the problems with
state-wide evaluation to date. The evaluations have been narrowly
conceived, based on misrecognition of the kind of data needed to make
the decisions. There has been confusion between evaluation on the three
different levels, or a lack of awareness that the three different
levels, demanding three different levels of data, existed.




The report focuses on decision Levels I and II for individually-paced
instruction. Evaluation designs for the Level III decisions
should make extensive use of sampling techniques (7). This sampling
might include sampling of districts, schools, buildings, or students,
and extensive sampling of the content domain (item or matrix sampling
techniques).

A Hypothetical Curriculum Structure

A carefully designed course structure facilitates the explanation 
of evaluation designs. The curriculum in the report is idealized, but 
contains the fundamental ingredients of most curricula that a teacher 
would devise for an individually-paced program. The course is used to
present specific examples of the concepts presented in the report. 

The hypothetical course contains eight units, learning activity
packages, modules, etc. These units are numbered 11, 12, 13, 14, 21,
22, 23, and 24. The students are expected to learn the units at their
own pace in the fixed sequence. The average student will spend
approximately 2 weeks on each unit. Six objectives are included in each
unit. A four-digit identification number for each objective consists of
the unit number as its first two digits; e.g., Objectives 1101, 1102,
1103, 1104, 1105, and 1106 are in Unit 11, and Objectives 2301, 2302,
2303, 2304, 2305, and 2306 are in Unit 23. There are a total of 48
objectives in the curriculum because there are eight units with six
objectives per unit. There are six test items per objective. The
six-digit test item identification number consists of the objective
number as its first four digits and a sequential identification number
unique to the objective as its last two digits; e.g., Items 210201,
210202, ..., 210206 are related to Objective 2102. There are a total of
288 test items because there are 48 objectives with six items per
objective in the curriculum. Note that each item is related to one and
only one objective, and that each objective is related to one and only
one unit.
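
The numbering scheme above can be sketched in a few lines of Python. This is only an illustration of the identification numbers described in the text; the function and variable names are hypothetical and do not come from the report.

```python
# Sketch of the hypothetical curriculum's numbering scheme.
# All names below are illustrative assumptions, not part of the report.
UNITS = [11, 12, 13, 14, 21, 22, 23, 24]
OBJECTIVES_PER_UNIT = 6
ITEMS_PER_OBJECTIVE = 6


def objective_ids(unit):
    """Four-digit objective numbers: the unit number followed by 01..06."""
    return [unit * 100 + k for k in range(1, OBJECTIVES_PER_UNIT + 1)]


def item_ids(objective):
    """Six-digit item numbers: the objective number followed by 01..06."""
    return [objective * 100 + k for k in range(1, ITEMS_PER_OBJECTIVE + 1)]


def objective_of(item):
    """Each item belongs to exactly one objective (its first four digits)."""
    return item // 100


def unit_of(item):
    """Each item belongs to exactly one unit (its first two digits)."""
    return item // 10000


ALL_OBJECTIVES = [o for u in UNITS for o in objective_ids(u)]
ALL_ITEMS = [i for o in ALL_OBJECTIVES for i in item_ids(o)]

print(len(ALL_OBJECTIVES), len(ALL_ITEMS))  # 48 288
```

Because the unit and objective are encoded in the leading digits, the "one and only one" relationships in the text reduce to integer division on the item number.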

Three evaluation models, the unit CAM model, the sliding unit CAM
model, and the standard CAM model are presented. In each of these
models, a student responds to 10 test forms, one at the completion of
each unit, and one at the beginning and end of the program. These ten
test administrations for each student are numbered from 1 to 10. Test
Administration 1 is the pretest of the entire program, and Test
Administration 10 is the posttest. Test Administration 2 occurs as each
student completes Unit 11; Test Administration 3 occurs as each student
completes Unit 12, etc.

The hypothetical course has an enrollment of 240 students distributed
among eight classes of 30 students each.

The number of test forms used differs with each model. However,
each form in all models contains 24 items. Therefore during the
individualized program each model generates exactly the same number of
student responses to test items. These three models can thus be compared
by the quality of information rather than by the quantity they produce.
However, the report does not suggest that all test forms should contain
24 items, or that all test forms in an evaluation design should contain
the same number of items. The curriculum used in the report is
hypothetical and is designed to facilitate the explanation of
alternative evaluation designs for individually-paced instruction.

The Unit CAM Model

The unit CAM model consists of a pretest of the entire program
during Test Administration 1, and a posttest of the entire program
during Test Administration 10. During each of Test Administrations 2-9,
a single unit CAM test is administered to each student. Each unit test
contains four test items related to each of the six objectives just
completed. The first two digits of the three-digit test form number
contain the unit which the form is measuring. Thus Test Form 115 is
given immediately following the completion of Unit 11 and contains four
test items on each of the six objectives in Unit 11; Test Form 125 is
given immediately following the completion of Unit 12; etc. The
relationship of test administrations to test forms for the unit CAM
model is displayed in Figure 2.1.

The unit CAM test forms (115-245) each contain objectives related
to one and only one unit. The student who responds to each of these
unit tests during the instructional program will be tested by four items
on all 48 objectives during Test Administrations 2-9. However, he will
be tested on each objective during only one test administration (i.e.,
on the "posttest" for the unit). The six objectives in Unit 11 are
measured on a short-term postinstructional basis (i.e., at completion of
the unit). Similarly, when Test Form 125 is given to a student, the six
objectives in Unit 12 are measured on an immediate postinstructional
basis. Thus, during Test Administrations 2-9, information concerning the
student's postinstructional achievement levels is gathered. There is no
preinstructional (i.e., testing before the student has worked on the
units) or retention (i.e., testing several weeks after the student has
completed the unit) information provided by these unit tests.
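
As a rough illustration of the unit CAM form described above, the sketch below assembles a 24-item form with four items on each of the six objectives in one unit. The report does not say how the four items are chosen from the six available per objective; drawing them at random is an assumption made here, and the function name is hypothetical.

```python
import random


def unit_cam_form(unit, rng, items_per_objective=4):
    """Build one unit CAM form: four items on each of the unit's six objectives.

    Assumption: the four items per objective are drawn at random from the
    six items written for that objective; the report does not specify this.
    """
    form = []
    for k in range(1, 7):
        objective = unit * 100 + k
        pool = [objective * 100 + j for j in range(1, 7)]
        form.extend(sorted(rng.sample(pool, items_per_objective)))
    return form


# Illustrative stand-in for Test Form 115 (given after Unit 11).
form_115 = unit_cam_form(11, random.Random(0))
print(len(form_115))  # 24
```

Every item on such a form belongs to the unit just completed, which is why the model yields postinstructional information only.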






FIGURE 2.1

The Relationship of Test Administrations
to Test Forms for the Unit CAM Model

Test
Administration     Unit Completed     Test Form

      1                none             1, 2
      2                 11              115
      3                 12              125
      4                 13              135
      5                 14              145
      6                 21              215
      7                 22              225
      8                 23              235
      9                 24              245
     10                 all             1, 2



The Sliding Unit CAM Model 

The unit CAM model employs only one test form per test
administration to gather detailed postinstructional information about
each student. The unit CAM model supports some Level I decisions, but
provides very little information for Level II decision making. The
sliding unit CAM model uses multiple test forms during each test
administration to gather some preinstructional and retention information
in addition to postinstructional information about the students and
curriculum. The multiple test forms allow a student to take a form of
the test, review objectives that he failed to learn, and retake a
different form of the same test. The construction of these multiple test
forms involves the use of stratified random sampling of objectives and
test items to guarantee that the data generated by the evaluation
program will be systematically related to the curriculum structure. Thus
the sliding unit CAM model supports both Level I and Level II decisions.






FIGURE 2.2

The Relationship of Test Forms to Test Administrations
in the Sliding Unit CAM Model

Test
Administration     Unit Completed     Test Forms

      1                none             1, 2
      2                 11              111, 112
      3                 12              121, 122
      4                 13              131, 132
      5                 14              141, 142
      6                 21              211, 212
      7                 22              221, 222
      8                 23              231, 232
      9                 24              241, 242
     10                 all             1, 2



Figure 2.2 contains the relationship of test forms to test
administrations in the sliding unit CAM model. Forms 1 and 2, given at
the beginning and end of the program, sample all the objectives in the
course and represent a pretest/posttest component of the evaluation
model. Forms 1 and 2 are discussed in the standard CAM model. During
each of Test Administrations 2-9, two test forms are used that mostly
measure the unit just completed. Figure 2.3 contains the actual test
item numbers assigned to each question position of the sliding unit CAM
test forms administered during the 7th test administration for each
student. A test scheduling procedure is used to have half of the
students respond to each form of the test.

In Figure 2.3 notice that 3 of the items on a form are related to
Unit 21 (the previous unit the student completed), 18 of the items are
related to Unit 22 (the unit just completed), and the remaining 3 items
are related to Unit 23 (the next unit to be worked on).

FIGURE 2.3

Items Assigned to Each Question Position of the
Sliding Unit CAM Test Forms Used After Unit 22

Question       Items Assigned to Each Form
Position          221          222

    1            210101       210502
   2-13          (item numbers illegible in the scanned source)
   14            220404       220403
   15            220406       220405
   16            220501       220502
   17            220503       220504
   18            220505       220506
   19            220602       220601
   20            220604       220603
   21            220606       220605
   22            230103       230206
   23            230405       230304
   24            230501       230602

At this point a distinction needs to be made between a test form
and a test. A test form is defined as several test items arranged in
order. A test can be thought of as the set (meaning one or more) of
test forms that are related to the same curriculum content. A formal
definition of test is presented later in the report. The test displayed
in Figure 2.3 contains two test forms (221 and 222) each containing 24
test items. The technique of assigning test items from Units 21, 22,
and 23 to Test Forms 221 and 222 is called stratified random sampling.
Details concerning this and other sampling techniques can be found in
Gorth (6).

Thus far, only the Unit 22 test has been examined. The tests
related to the other seven units are similarly constructed. The tests
related to each unit contain 48 items (two forms with 24 items each).
Six of the 48 items are used to measure the last unit completed, 36
items are used to measure the unit being completed, and six items are
used to measure the unit to be attempted next. The actual construction
of each of these forms is similar to the specifications of Forms 221 and
222 shown in Figure 2.3.
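
The 3/18/3 composition of a sliding unit CAM test can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure documented in Gorth (6): the six objectives of each adjacent unit are split at random between the two forms with one randomly chosen item each, and the six items of every objective in the current unit are split three and three. The function names are hypothetical, and the sketch covers only interior units that have both a previous and a next unit.

```python
import random


def objectives(unit):
    return [unit * 100 + k for k in range(1, 7)]


def items(objective):
    return [objective * 100 + j for j in range(1, 7)]


def sliding_unit_forms(prev_unit, unit, next_unit, seed=0):
    """Return two 24-item forms: 3 previous-unit, 18 current-unit, 3 next-unit items."""
    rng = random.Random(seed)

    def adjacent(u):
        # Stratum for a neighbouring unit: each form gets three of its six
        # objectives, so the two forms together touch all six (assumption).
        objs = objectives(u)
        rng.shuffle(objs)
        first = [rng.choice(items(o)) for o in sorted(objs[:3])]
        second = [rng.choice(items(o)) for o in sorted(objs[3:])]
        return first, second

    prev_a, prev_b = adjacent(prev_unit)
    next_a, next_b = adjacent(next_unit)

    cur_a, cur_b = [], []
    for o in objectives(unit):
        # Stratum for the current unit: three items per objective per form.
        pool = items(o)
        rng.shuffle(pool)
        cur_a.extend(sorted(pool[:3]))
        cur_b.extend(sorted(pool[3:]))

    return prev_a + cur_a + next_a, prev_b + cur_b + next_b


form_221, form_222 = sliding_unit_forms(21, 22, 23)
```

Under these assumptions the two forms together use all 36 items of the current unit and six items from each adjacent unit, matching the counts given in the text.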

There are advantages to using more than one form of a test for 
both Level I and Level II decision making. As stated above, a student 
can respond to a different form of the test to measure the success of 
his additional work. If only one form of a sliding unit CAM test were 
used then only three objectives in each unit could be measured on a pre- 
instruction and retention basis. The use of two forms enables all ob- 
jectives to be pretested and checked for retention achievement levels. 
This increase in information is important for curriculum revision and
item rewriting decisions (i.e., Level II decision making).

The Standard CAM Model 

The standard CAM model is an evaluation technique for generating
good information for Level II decision making. Correspondingly, the
information generated for Level I decision making is not as good as that
generated by the unit and sliding unit CAM models. Extensions of the
standard CAM model are applicable for Level III decision making
evaluation designs (8). The standard CAM model for the hypothetical
course consists of ten comprehensive interchangeable forms containing 24
items each. These forms are comprehensive in the sense that each one
uniformly covers objectives in all eight units, and are interchangeable
in that they are ten different forms of a 24-item final examination for
the course. The items on each of the test forms are presented in Figure
2.4. Note that the forms are numbered 1 to 10. Each of these forms
contains three items related to each of the eight units, and every item
on a form is related to a different objective. Stratified random
sampling was used to first assign the objectives to the question
positions. The stratification process guarantees that three items per
form are related to each of the eight units. Item sampling was then used
to select the actual test items to be assigned to each question
position.
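
A minimal sketch of the two-stage sampling just described is given below. It assumes the simplest reading of the text: within each unit (the stratum) three distinct objectives are drawn at random, and one item is then drawn at random for each chosen objective. The report's actual procedure may balance objectives and items across the ten forms more tightly; the names used here are hypothetical.

```python
import random

UNITS = [11, 12, 13, 14, 21, 22, 23, 24]


def standard_cam_form(rng):
    """One comprehensive 24-item form: three items from each of the eight units."""
    form = []
    for unit in UNITS:
        unit_objectives = [unit * 100 + k for k in range(1, 7)]
        # Stage 1: stratified sampling of objectives (three per unit).
        for objective in sorted(rng.sample(unit_objectives, 3)):
            # Stage 2: item sampling (one of the six items for that objective).
            form.append(objective * 100 + rng.randint(1, 6))
    return form


def standard_cam_forms(n_forms=10, seed=0):
    rng = random.Random(seed)
    return [standard_cam_form(rng) for _ in range(n_forms)]


forms = standard_cam_forms()
```

Stratifying by unit is what guarantees the even coverage of the course on every form; sampling objectives without stratification would only achieve it on average.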

FIGURE 2.4

Items Assigned to Each Question Position
of Each of the Standard CAM Test Forms

An entry of ------ marks an item number that is illegible in the
scanned source.

Question                     Items Assigned to Each Form
Position     1      2      3      4      5      6      7      8      9      10

    1     110106 110201 110102 110205 110204 110103 110104 110203 110206 110105
    2     110403 110306 110302 110406 110301 110405 110305 110401 110304 110402
    3     110506 110605 110604 110501 110603 110504 110502 110602 110503 110601
    4     120202 120101 120203 120105 120104 120204 120201 120103 120206 120102
    5     120303 120406 120306 120402 120401 120305 120405 120301 120404 120302
    6     120505 120605 120501 120604 120506 120603 120602 120503 120606 120504
    7     130106 130202 130103 130201 130205 130104 130204 130105 130206 130101
    8     130404 130306 130405 130303 130302 130401 130402 130301 130403 130305
    9     130501 130601 130502 130606 130503 130604 130504 130603 130602 130505
   10     140103 140203 140104 140206 140201 140105 140205 140106 140204 140102
   11     140405 140305 140401 140304 140303 140402 140302 140406 140301 140404
   12     140502 140602 140503 140601 140506 140606 140505 140604 140501 140603
   13     210104 210206 210203 210106 210101 210202 210102 210201 210103 210205
   14     210301 210401 210405 210302 210404 210303 210306 210406 210403 210304
   15     210603 210503 210604 210502 210605 210501 210606 210505 210602 210504
   16     220105 220205 220105 220204 220203 220102 220103 220206 220104 220201
   17     220302 220402 220306 220401 220304 220405 220406 220305 220403 220301
   18     220604 220504 220503 220605 220601 220502 220602 220501 220506 220603
   19     230201 230106 230202 230104 230203 230103 230206 230102 230205 230101
   20     230302 230402 230401 230303 230304 230405 230305 230404 230301 230406
   21     230504 230604 230505 230603 230501 230602 230606 230506 230605 230503
   22     240101 240201 240102 240205 240206 ------ ------ ------ 240202 240105
   23     240403 240303 240302 240404 ------ 240405 240306 240401 240304 240402
   24     240505 240605 240506 240604 240502 240603 240503 240606 240601 240504

At the beginning of the course (Test Administration 1), each student
responds to one of these ten forms, and each form is responded to by
24 students in the course. During the second test administration,
the process is repeated, but each student responds to a different test
form. At the end of the instructional program, each student has
responded once and only once to each of the ten test forms. Details
concerning this scheduling process can be found in Gorth (8) and Pinsky
(9).
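
One simple way to satisfy the scheduling constraints stated above (every student sees each of the ten forms exactly once, and each form is answered by 24 of the 240 students at every administration) is a cyclic rotation. The sketch below is an assumption made for illustration; the scheduling procedure actually used is the one described in the cited references, and the names here are hypothetical.

```python
N_FORMS = 10
N_STUDENTS = 240


def form_for(student, administration):
    """Form number (1..10) for a 0-based student index at administration 1..10.

    Assumption: a cyclic rotation, so each student starts on a different
    form and advances by one form at each administration.
    """
    return (student + administration - 1) % N_FORMS + 1


schedule = {
    student: [form_for(student, t) for t in range(1, N_FORMS + 1)]
    for student in range(N_STUDENTS)
}

print(schedule[0])  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Because every form is in use at every administration, summing responses over students yields data on all objectives at each point in the course, which is what makes the standard model useful for Level II decisions.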

Consider a student responding to a standard CAM test form following
completion of Unit 22 (i.e., Test Administration 7). There are only 3
test items related to Unit 22 on each of these forms. Therefore, the
standard CAM test does not provide information for deciding if the
student has learned Unit 22 and should move on to Unit 23. Level I
decisions are not supported by the standard CAM model. On the other hand,
when the data from this standard CAM model is summed over all the students
in the course, it provides excellent data for Level II decision making.
One can examine each objective in the course for input (preinstruction)
and output (retention) achievement levels. The interaction of learning
one objective upon the achievement levels of other objectives can be
examined (for more detail see Gorth (9)).

Generalized Evaluation Concepts

Set - One or more. For example, a set of objectives is one or more 
objectives; a set of forms is one or more forms; a set of tests is one 
or more tests. 

Content Span - A collection of ordered objectives specified in terms
of the first and last objectives in the collection. In most instances
the ordering of objectives is defined by the order in which they are
taught. The name given the collection is related to the portion of the
curriculum covered by the objectives in terms of content in text or time.
For example, in the hypothetical curriculum, the content span containing
Objectives 1101-1106 is Unit 11; the content span containing Objectives
1101-2406 is the entire curriculum.

Test Form - A collection of items in an order that is presented to
the students. The term "form" is an accepted short version of the term
"test form". Since each item is associated with an objective, the item
numbers represent a content span for the test form.

Test - A set of forms that contains all test forms with the same
content span.

Objective Density - The proportion of items related to an objective
on a test or on a test form. The denominator of the objective density
is the total number of items on the test or test form; the numerator is
the number of items related to the specific objective on the test or test
form.
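
The definition of objective density can be written out as a short calculation. The sketch below is illustrative only. It assumes that a six-digit item number carries the objective in its first four digits (so that item 220302 belongs to Objective 2203); the function names and the sample form are hypothetical.

```python
from fractions import Fraction


def objective_of(item_number: int) -> int:
    # Assumption: first four digits = objective, last two = item index.
    return item_number // 100


def objective_density(items: list, objective: int) -> Fraction:
    """Items related to `objective` divided by all items on the test or form."""
    related = sum(1 for item in items if objective_of(item) == objective)
    return Fraction(related, len(items))


# A made-up six-item form: three items on Objective 2201, one each on three others.
form = [220101, 220102, 220103, 220201, 220301, 220401]
print(objective_density(form, 2201))
print(objective_density(form, 2204))
```

Fraction reduces automatically, so a density of 6/48 is reported as 1/8; the two are the same proportion.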

Evaluation Period - A set of test administrations. 

Standard Evaluation Component - A test consisting of more than one
form that is used for more than one test administration during the
evaluation period.

Sliding Unit Evaluation Component - A set of tests such that the
content span of each test contains one or more objectives from the
content span of the test used in the immediately preceding or the
immediately following test administration. Each test is used only once
in an evaluation period.






Evaluation Design - A set of all the evaluation components designed
for an evaluation period.

The unit CAM model is an evaluation design with two components -
a standard component during Test Administrations 1 and 10, and a unit
component during Test Administrations 2-9. The standard component
consists of Forms 1 and 2 in Figure 2.4. This standard CAM test is based
upon the content span of the entire curriculum, i.e., Objectives
1101-2406. Each of those objectives appears once on the test, and the
objective density in the test is 1/48 for each of the objectives. The
unit component of the unit model consists of eight tests, each related
to one of the eight units in the curriculum. The content span of the
Unit 22 test (i.e., Form 235) is Objectives 2201-2206; the content span
of the Unit 23 test (i.e., Form 23") is Objectives 2301-2306, and the
content span of the Unit 24 test (i.e., Form 2k^i) is Objectives
2401-2406. Note that the content spans of these tests do not overlap,
i.e., they contain no objectives in common.

The sliding unit CAM model is an evaluation design consisting of
two components -- a standard component in Test Administrations 1 and 10,
and a sliding unit component in Test Administrations 2-9. The standard
component of the sliding unit model is identical to the standard
component of the unit model.

The sliding unit component consists of eight tests, one test related
to each of the eight units. The test for Unit 22 has a content span of
Objectives 2101-2306 (refer to Figure 2.3). This test contains 48 items,
24 items per form. Objectives 2101-2106 each appear only once on the
test and have an objective density of 1/48; Objectives 2201-2206 each
appear six times on the test and have an objective density of 6/48;
Objectives 2301-2306 each appear once on the test and have an objective
density of 1/48. The test for Unit 21 (not presented in the report) has
a content span of Objectives 1401-2206. Objectives 1401-1406 each appear
only once on the test and have an objective density of 1/48; Objectives
2101-2106 each appear six times on the test and have an objective density
of 6/48; Objectives 2201-2206 each appear once on the test and have an
objective density of 1/48 on the test.
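
The item counts quoted for the sliding unit tests can be checked with a short calculation. The sketch below is illustrative. It assumes the eight units of the hypothetical curriculum are numbered 11-14 and 21-24 with six objectives each (inferred from the objective numbers in the text), and that a test for the first or last unit simply has no neighbor on that side; the function name is hypothetical.

```python
from fractions import Fraction

UNITS = [11, 12, 13, 14, 21, 22, 23, 24]  # inferred unit numbering (assumption)
OBJECTIVES_PER_UNIT = 6


def sliding_unit_counts(unit: int) -> dict:
    """Map objective number -> appearances on the sliding unit test for `unit`:
    once for the previous unit, six times for the unit itself, once for the next."""
    i = UNITS.index(unit)
    counts = {}
    for j, appearances in ((i - 1, 1), (i, 6), (i + 1, 1)):
        if 0 <= j < len(UNITS):  # edge units have only one neighbor (assumption)
            for k in range(1, OBJECTIVES_PER_UNIT + 1):
                counts[UNITS[j] * 100 + k] = appearances
    return counts


counts = sliding_unit_counts(22)
total = sum(counts.values())          # 6*1 + 6*6 + 6*1 items
densities = {obj: Fraction(n, total) for obj, n in counts.items()}
print(total, densities[2101], densities[2201], densities[2301])
```

For Unit 22 this reproduces the 48 items and the densities of 1/48, 6/48, and 1/48 given above.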

The standard CAM model is an evaluation design consisting of a single
standard evaluation component. The standard test (refer to Figure 2^0)
consists of ten test forms defined over the entire curriculum content
span (Objectives 1101 through 2406). The content span contains 48
objectives, and each objective appears five times on the test. Thus the
objective density is 5/240 for each objective in the curriculum.

A Comparison of the Three Evaluation Models

Level I Decisions - The unit CAM model provides the most information
for making decisions concerning an individual student's mastery of
objectives on an immediate postinstruction basis. After each unit is
completed, the unit model generates four responses to each of the last
six objectives completed. However, the unit CAM model does not provide any
information concerning the student's preinstruction or retention
achievement levels on the objectives in the course.




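FIGURE 2.5

Analysis of Responses to Questions Measuring
Achievement on Objective 2201 by Test Administration

                                                        Test Administration
Model         Statistic                           1     2     3     4     5     6     7     8     9    10

Standard      Number of Responses               120   120   120   120   120   120   120   120   120   120
              Percentage of Total Responses     10%   10%   10%   10%   10%   10%   10%   10%   10%   10%
              Number of Items Used                5     5     5     5     5     5     5     5     5     5

Sliding       Number of Responses               120     0     0     0     0   120   720   120     0   120
Unit          Percentage of Total Responses     10%    0%    0%    0%    0%   10%   60%   10%    0%   10%
              Number of Items Used                1     0     0     0     0     1     6     1     0     1

Unit          Number of Responses               120     0     0     0     0     0   960     0     0   120
              Percentage of Total Responses     10%    0%    0%    0%    0%    0%   80%    0%    0%   10%
              Number of Items Used                1     0     0     0     0     0     4     0     0     1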






The sliding unit CAM model provides very good information for mastery
decision making relative to the individual student. This model generates
three responses to each of the last six objectives completed. In
addition, the sliding unit model provides a sampling of preinstruction and
retention achievement levels for the individual student. In comparison
with the unit model, the sliding unit model sacrifices some reliability
concerning the immediate postinstruction mastery decisions (three items
per objective rather than four) in order to gain some information
concerning the student's preinstruction and retention achievement levels.

The standard CAM model provides very little information concerning
the student's mastery of objectives. During each test administration,
the standard model generates at most one response to each of the
objectives in the course for each student. However, this model does
provide an estimate of the student's preinstruction and retention
achievement levels across all objectives every two weeks.

Level II Decisions - The unit CAM model provides very little
information for Level II decision making. During Test Administrations 2-9,
the model only provides group information about the latest six objectives
completed. There is no information concerning the student's
preinstruction and retention achievement levels on the other objectives in
the curriculum.

The sliding unit CAM model provides more information about groups 
of students than the unit model. In addition to the group achievement 
level on the latest six objectives completed, the model generates esti- 
mates of group achievement on the previously completed six objectives 
and on the six objectives to be studied next. No information is provided 
concerning objectives completed more than one unit previously, nor con- 
cerning objectives to be studied after the next unit. 

The standard CAM model provides information concerning the student's
achievement on all 48 objectives following each test administration.
Thus one is able to measure preinstruction, postinstruction, and
retention achievement on all objectives in the course, and is able to
measure the interaction effect of studying one objective upon the
achievement levels of other objectives.

The information for groups of students generated by the three models
is summarized in Figures 2.5 and 2.6. Figure 2.5 contains an analysis of
the information about Objective 2201 generated by the three models in
each test administration. Number of Responses refers to the number of
possible student responses in each test administration to items related
to Objective 2201. The hypothetical curriculum structure and evaluation
models were designed such that each objective is responded to 1200 times
by students during the program. It is the distribution of these 1200
responses over the test administrations that differs from model to model.
This distribution is given by the Percentage of Total Responses. The
Number of Items Used refers to the degree of item sampling that is used
in each model. For instance, in Test Administration 1, all models
produce 120 student responses to Objective 2201. However, the standard
model uses five items (ten forms are used), while the sliding unit and
unit models use only one item each (only two forms are used). Remember
that an objective only appears on every other form in the standard
evaluation component. This figure displays the fact that the unit and
sliding unit models generate more postinstruction information, while the
standard model generates more preinstruction and retention information.
Remember that each student completes instruction on Objective 2201 prior
to Test Administration 7.

FIGURE 2.6

Analysis of Responses to Questions Measuring
Achievement on Objective 2201 by Time Reference

Model         Statistic                        Preinstruction   Postinstruction   Retention

Standard      Number of Responses                   720               240            240
              Percentage of Total Responses         60%               20%            20%

Sliding       Number of Responses                   240               840            120
Unit          Percentage of Total Responses         20%               70%            10%

Unit          Number of Responses                   120               960            120
              Percentage of Total Responses         10%               80%            10%

Notes: PREINSTRUCTION  - Student response before instruction on
                         Objective 2201.
       POSTINSTRUCTION - Student response to Objective 2201 during
                         Test Administrations 7 and 8.
       RETENTION       - Student response to Objective 2201 during
                         Test Administrations 9 and 10.

Figure 2.6 contains an analysis by time reference of the information
about Objective 2201 generated by the three models. The student responses
are broken down into PREINSTRUCTION (a response before instruction on the
objective), POSTINSTRUCTION (a response to the objective during Test
Administrations 7 and 8), and RETENTION (a response to the objective
during Test Administrations 9 and 10). Note that as in Figure 2.5, each
model generates 1200 student responses throughout the program. It is the
distribution of responses over the time references (Percentage of Total
Responses) that changes from model to model.
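
The time-reference breakdown of Figure 2.6 follows directly from the per-administration counts of Figure 2.5. The sketch below regroups those counts; it is illustrative only. The per-administration values are as read from Figure 2.5, with the unit-model row inferred from the totals in Figure 2.6 (an assumption), and the names RESPONSES, TIME_REFERENCES, and by_time_reference are hypothetical.

```python
# Responses to Objective 2201 in Test Administrations 1..10 (Figure 2.5).
# The unit-model row is inferred from the Figure 2.6 totals (assumption).
RESPONSES = {
    "standard":     [120] * 10,
    "sliding unit": [120, 0, 0, 0, 0, 120, 720, 120, 0, 120],
    "unit":         [120, 0, 0, 0, 0, 0, 960, 0, 0, 120],
}

# Instruction on Objective 2201 is completed before Test Administration 7.
TIME_REFERENCES = {
    "preinstruction":  range(1, 7),    # administrations 1-6
    "postinstruction": range(7, 9),    # administrations 7 and 8
    "retention":       range(9, 11),   # administrations 9 and 10
}


def by_time_reference(per_admin: list) -> dict:
    """Sum per-administration responses within each time reference."""
    return {name: sum(per_admin[t - 1] for t in admins)
            for name, admins in TIME_REFERENCES.items()}


for model, per_admin in RESPONSES.items():
    totals = by_time_reference(per_admin)
    shares = {k: round(100 * v / sum(per_admin)) for k, v in totals.items()}
    print(model, totals, shares)
```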

The power of the standard CAM model for Level II decision making can
be seen if one considers the question as to the proper sequencing of
objectives in the curriculum structure. By providing estimates of the
class's achievement level ten times during the course (i.e., a
longitudinal achievement measure), the standard model enables the teacher
to recognize interactive instructional effects. Suppose that the
instructional activities related to Objective 1203 also affect the
achievement level on Objective 2201. Instruction on Objective 1203 takes
place between Test Administrations 2 and 3, and the standard model also
provides an estimate of the students' achievement on Objective 2201
during these test administrations (see Figure 2.5). If there is a
significant change in the achievement level on Objective 2201 from Test
Administration 2 to Test Administration 3, the course structure might be
resequenced the following year to include Objectives 1203 and 2201 in
the same unit. An analysis of Figure 2.5 shows that the sliding unit and
unit models generate virtually no longitudinal data.

Consider an input-output analysis of the effectiveness of the
hypothetical course structure. Input is taken to mean the students'
preinstruction achievement level, and output is taken to mean the
students' retention achievement level. Retention is being used as the
output measure because postinstruction achievement levels sometimes
contain transient achievement such as rote memory. An analysis of Figure
2.6 indicates that the standard model generates 80% of the student
responses on a preinstruction and retention basis, the sliding unit model
30%, and the unit model 20% on a preinstruction and retention basis.
Thus, the standard model generates data that is more useful for an
input-output analysis of a course.
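
The 80%, 30%, and 20% figures are the preinstruction and retention counts of Figure 2.6 taken together as a share of all 1200 responses. A minimal sketch of that arithmetic follows; the names FIGURE_2_6 and input_output_share are hypothetical.

```python
# (preinstruction, postinstruction, retention) responses from Figure 2.6
FIGURE_2_6 = {
    "standard":     (720, 240, 240),
    "sliding unit": (240, 840, 120),
    "unit":         (120, 960, 120),
}


def input_output_share(pre: int, post: int, retention: int) -> float:
    """Share of responses usable for an input-output analysis."""
    return (pre + retention) / (pre + post + retention)


for model, counts in FIGURE_2_6.items():
    print(f"{model}: {input_output_share(*counts):.0%}")
```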



3. The Application of Computer-Assisted Instruction to Criterion-Referenced
Testing



The criterion-referenced testing (CRT) evaluation designs discussed
in the previous section can be classified as static testing. The tests
are constructed before the students enter the program. The students are
measured by these tests that cannot account for individual differences.
Dynamic testing, on the other hand, would be able to construct each test
based upon a student's past history. This type of testing would
undoubtedly involve the use of a computer to print out individualized
tests based upon the student's performance history that is stored in the
computer. Computer-assisted instruction (CAI) uses dynamic testing in that
it prints out test items or exercises based upon the student's level in
the program. Of course, CAI does this testing on-line, and by providing
immediate feedback on each item serves as an instructional as well as
a testing mechanism.

A common structure for CAI is the strands structure that is used
at Stanford University. For example, the elementary mathematics curriculum
structure developed by Patrick Suppes contains 15 strands. Each strand
includes all problem types of a given concept (e.g., fractions, equations)
or of a major subtype of a concept (e.g., horizontal addition, vertical
multiplication) presented in grades one through six. Within each strand,
problems of a homogeneous type (e.g., all horizontal addition problems
with a sum from zero to five) are grouped into equivalence classes. Each
strand contains either five or ten classes per half-year with each class
labeled in terms of a grade-placement equivalent.

A student is working on one equivalence class in each strand. The 
equivalence classes are structured in an increasing order of difficulty 
within each strand. Thus the student works on a given class until he
passes a criterion, after which he moves up to the next class. There are
review exercises within a strand that the student must successfully
respond to. Failure to correctly answer these review exercises can result
in his being lowered a few equivalence classes. During each session at
the computer terminal, the student responds to exercises from several 
strands. The emphasis placed on each strand depends upon the student's 
approximate grade placement, and upon his distribution of equivalence 
classes across the strands. A student will tend to receive more items 
on the strands where he is in the lower equivalence classes. 
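
The movement of a student through equivalence classes, and the heavier sampling of strands in which he is behind, can be sketched as follows. This is an illustration under stated assumptions, not the Stanford program's actual rules: the demotion size, the top class number, the weighting formula, and all function names are hypothetical, and the dependence on approximate grade placement mentioned above is left out.

```python
import random


def next_class(current: int, passed_criterion: bool, failed_review: bool,
               demotion: int = 2, top: int = 60) -> int:
    """Return the student's next equivalence class within one strand.
    `demotion` and `top` are hypothetical values."""
    if failed_review:
        return max(1, current - demotion)   # lowered a few classes
    if passed_criterion:
        return min(top, current + 1)        # move up to the next class
    return current                          # keep working on the same class


def strand_weights(positions: dict) -> dict:
    """Give more weight to strands where the student is in lower classes."""
    highest = max(positions.values())
    raw = {s: highest - c + 1 for s, c in positions.items()}
    total = sum(raw.values())
    return {s: w / total for s, w in raw.items()}


def pick_strand(positions: dict, rng=random) -> str:
    """Draw the strand for the next exercise according to the weights."""
    weights = strand_weights(positions)
    return rng.choices(list(weights), weights=list(weights.values()), k=1)[0]


positions = {"horizontal addition": 3, "fractions": 7, "equations": 5}
print(strand_weights(positions))
print(pick_strand(positions))
```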

The major drawback to CAI has been the cost of having the students
responding on-line to a computer. This report explores the possibility
of using the CAI curriculum structure of strands and equivalence classes,
printing the tests on the computer, and having the students respond to
the tests off-line, i.e., at their desk in the classroom. There are many
problems to this concept, and the report discusses these problems that
need to be explored with more field-oriented research.

The equivalence classes within the strands structure are analogous
to the performance objectives that are required in CRT. Many levels of
performance objectives have been defined in the field of CRT. The
equivalence classes approximate enabling objectives as defined by O'Reilly
(10). The generation of the test items can become a serious problem. In
the mathematics CAI strands program, the test items are produced by item
generation algorithms. Thus the test items themselves are not stored in
the computer. Attempts to develop item generation algorithms in other
subject matter areas have not been very successful. One of the major
research areas in CRT today is producing useful item generation rules in
a variety of subject matter areas (11).

An alternative to item generation rules is to physically store
thousands of test items in disc storage and actually retrieve the items
when required. Robert O'Reilly of the New York State Education Dept. has
tried this technique for reading grades 4-6 and encountered serious
problems. First, the computer software development costs were very high.
The software includes adequate editing capabilities for correcting and
modifying the item data base during the first year of operation. The data
entry costs were extremely high. O'Reilly wanted to use upper-lower case
characters and decided to enter his data via an optical character reading
(OCR) machine. Problems were encountered when attempting to maintain
quality control on the test item data base. Additional problems arose
when test items required special symbols or diagrams. Computerized
microfilm has the potential for solving many of the problems associated
with maintaining a computerized test item bank. However, using today's
technology, the microfilm technique is too slow and expensive.

Another problem that arises in attempting to apply CAI techniques to
CRT is modifying the decision rules for moving a student through the
equivalence classes or the analogous objectives. A student in the CAI
program responds to approximately 50 test items every day. A student
in a CRT program might respond to 50 test items per week. Thus there is
only 20% as much information in the CRT program as in the CAI program.
The ability to make decisions regarding changes in equivalence
classes in each of the 15 strands following every test administration for
a student becomes questionable. One runs into a classical
bandwidth-fidelity measurement dilemma (12). As the amount of data
decreases, the reliability of the decisions that are made decreases.
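
The 20% figure, and the thinness of the data per strand, can be made explicit with a few lines of arithmetic. The sketch assumes a five-day school week, which the text implies but does not state; the variable names are hypothetical.

```python
ITEMS_PER_DAY_CAI = 50      # from the text
ITEMS_PER_WEEK_CRT = 50     # from the text
N_STRANDS = 15              # from the text
SCHOOL_DAYS_PER_WEEK = 5    # assumption: five school days per week

cai_items_per_week = ITEMS_PER_DAY_CAI * SCHOOL_DAYS_PER_WEEK
ratio = ITEMS_PER_WEEK_CRT / cai_items_per_week

# If one weekly test had to cover all strands evenly:
items_per_strand = ITEMS_PER_WEEK_CRT / N_STRANDS

print(f"CRT yields {ratio:.0%} of the CAI data")
print(f"about {items_per_strand:.1f} items per strand per weekly test")
```

With only three or so items per strand per test, a decision to move a student between equivalence classes in every strand rests on very little evidence, which is the dilemma described above.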

In the above paragraph, it is stated that a CRT program might
generate responses to 50 test items per week (say one 50 item test per
week). The reader might wonder why the CRT program could not include a
test every day for each student. The problem lies in the cost factor. The
two most expensive aspects of data processing today are input and
output. The cost of producing a 50-item test for each student once a day
is expensive. And the cost of entering student responses on a daily
basis can become prohibitive. In addition, there is the cost (sometimes
hidden) of administering these paper and pencil tests. The cost of
on-line computer power is decreasing very rapidly these days, much faster
than the cost of input-output devices. The cost of human clerical help
is increasing year after year. Therefore, any attempts to monitor student
progress on a daily basis might best be done using an on-line testing
environment.



4. Earth Science Criterion-Referenced Testing Results

Background

During the fall of 1972 an experiment was conducted to determine the
effectiveness of various uses of criterion-referenced testing and the
ability of students to make their own instructional decisions based upon
the CRT computerized output. The course used in the experiment was a 9th
grade earth science course at San Carlos High School in San Carlos,
California. The course was under the leadership of Larry Wagner at San
Carlos High. The experiment was conducted with the cooperation of John
Easter, Director of Project CAM, Sequoia Union High School District,
Redwood City, California (13). There were originally four classes
totalling 120 students involving two teachers in the study. However, one
teacher left the school during the year. Thus the data presented here
represents three classes, eighty-five students and one teacher. All
results presented are based upon a small sample size and must be
considered as tentative.

Each student at San Carlos High must complete one year of a science
course before graduating. In the school, the college-bound students tend
to take life science courses. The earth science program attracts students
with a wide variety of ability and motivational levels. Data is presented
concerning the background of the students in the program.

During the summers of 1969 and 1970, Wagner developed individualized
study packets containing performance objectives, learning activities,
self-tests, and posttests. During the 1970-71 school year students were
free to select packets within each of the earth science content areas of
astronomy, geology, meteorology, and oceanology. Each content area lasted
one quarter. Based upon student feedback and CAM data, Wagner decided
to modify the course design and change the evaluation procedures for the
school year 1971-72.

The course design for 1971-72 was more traditional in that the students
were group-paced but still used the packets developed for 1969-70. CAM
results indicated a greater increase in student performance than in the
previous year. There were still students who wanted to move independently
and they were given the option. However, the number of students working
independently was kept to a minimum to facilitate record keeping.

Although the group-paced instruction data showed greater increases in
student performance than the self-paced instruction group of the previous
year, it was recognized that many students need to assume more
responsibility and make more of their own decisions as to lesson
selection and completion. Therefore, the format for the 1972-73 school
year included a group/self-paced combination as described below.

Curricula Structure 



The astronomy section of the earth science course is composed of 88 
instructional objectives which make up 23 lessons within the six astronomy



SAN CARLOS EARTH SCIENCE                         NAME

                                                 DATE            PERIOD

Chapter 26 - STARS AND GALAXIES

LESSON 6 - STELLAR EVOLUTION AND GALAXIES

Objective Number    INSTRUCTIONAL OBJECTIVES

2661 - Identify the correct description and/or size of our galaxy.

2662 - Associate the name of the 3 main types of galaxies to a description
       or diagram.

2663 - Identify the density, composition and origin of the great gas and
       dust clouds of interstellar space.

2664 - Be able to identify characteristics of each stage in the life history
       of a star in terms of temperature, color, and size of the star, and
       relate the process to our sun.

2665 - Be able to select the correct explanations or sketches which stand for
       the following origins of the universe:
          a. Expanding universe theory
          b. Steady-state theory

                    ACTIVITY OBJECTIVES

2666 - Complete the study guide on STELLAR EVOLUTION AND GALAXIES using your
(5)    text and other books as references.

2667 - LABORATORY ACTIVITY - COMPARING THE SUN WITH OTHER STARS. Perform the
(5)    activity as described in the handout with the same name.

2668 - LABORATORY ACTIVITY - INVESTIGATING GALAXIES. Perform the activity as
(5)    described in the handout with the same name.

2669 - ESCP READINGS FOR GREATER UNDERSTANDING
(20)
          a. STELLAR EVOLUTION (10) - Read pages 536-543 and answer
             questions 1-5 page 544.

          b. WE LIVE IN A GALAXY (5) - Read pages 544-547 and answer
             questions 1-4 page 547.

          c. OUR GALAXY AMONG ITS NEIGHBORS (5) - Read pages 547-549 and
             answer questions 1-4 page 549.



chapters of the text. There are 2-7 lessons per chapter with an average
of 3.8 objectives per lesson (from 2-!;> objectives per lesson). Examples
of the objectives are shown in Figure 4.1. The lessons are also made
up of several activity objectives as shown in Figure 4.1.

The students were divided into three groups for the study, each
group corresponding to a class section. The course was group-paced in
terms of what chapters the students were studying, but was individually
paced in terms of lessons (order and number) within the chapter. The
major differences between the three groups were:

Group 1 - The teacher, based upon results of the CAM and
          Dubins Earth Science test, decided which lessons
          the student would study, and the lesson or lessons
          he had completed.

Group 2 - The student decided which lessons he would study
          and when he had completed a lesson.

Group 3 - The teacher, based upon results of the CAM and
          Dubins Earth Science test, decided which lessons
          the student would study, but the student decided
          when he had completed a lesson.

In all instances where a student declared that he had completed a
lesson, he received positive credit for completing objectives within a
lesson on which he answered the test items correctly. If he got these
items wrong, he lost credit. The loss of credit policy was implemented
to reduce the number of students who would have declared a lesson
completed hoping that they would have gotten items correct by chance
and thus received credit for objectives completed. In addition, a
student was allowed to repeat any test (a different form). However, the
score on the second test replaced the score on the first test even if the
second score was lower. Thus, the student hopefully was motivated to
study additional material if he chose to retake a test.
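
The credit and retake rules can be sketched in a few lines. The report does not give the size of the credit or of the penalty, nor the number of answer choices per item, so the one-point gain, the one-point loss, and the 0.25 chance of a correct guess below are hypothetical values chosen only to show the mechanism; the function names are hypothetical as well.

```python
from typing import Optional


def lesson_credit(results: list, gain: int = 1, penalty: int = 1) -> int:
    """Credit for a declared-complete lesson: gain per objective answered
    correctly, loss per objective answered wrong (hypothetical sizes)."""
    return sum(gain if correct else -penalty for correct in results)


def recorded_score(first: int, retake: Optional[int] = None) -> int:
    """A retake always replaces the first score, even when it is lower."""
    return first if retake is None else retake


def expected_guess_credit(p_correct: float, gain: int = 1, penalty: int = 1) -> float:
    """Expected credit per objective for a student who only guesses."""
    return p_correct * gain - (1 - p_correct) * penalty


print(lesson_credit([True, True, False]))
print(recorded_score(5, 3))
print(expected_guess_credit(0.25))
```

Under these assumed values a pure guesser expects to lose credit on average, which is the deterrent the policy was meant to provide.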

Evaluation Design 

The evaluation design for the study consisted of CAM testing; two
administrations of the Dubins Earth Science Test, one at the beginning
and one at the end of the course; and two administrations of a student
questionnaire, again once at the beginning and once at the end of the
course. In addition, the results of the CTBS reading and mathematics
tests for the students were obtained from the district records.

The CAM evaluation design consisted of a standard CAM with two forms
given on Test Administrations 1 and 8, and sliding unit CAMs given during
Test Administrations 2-7. The two forms of the standard CAM were made up
of questions randomly selected from the astronomy bank of items so that
all lessons were sampled. Each student received the form during Test
Administration 8 that he did not receive during Test Administration 1.
The sliding unit CAM contained two items per lesson on lessons already
completed; five items per lesson on the lessons just studied; and two
items per lesson on the lessons to be studied next. The sliding unit
CAMs were administered every two to three weeks. Each test consisted of
two forms so that one-half of the class received each form. If a student
chose to retake a test, he took the alternative form.
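
The length of one sliding unit CAM form follows from the per-lesson item counts. The sketch below reads the count for lessons just studied as five, which agrees with the seven postinstruction responses per lesson (five plus two) shown on the lesson completion summary; the lesson counts in the example and the names used are hypothetical.

```python
# Items per lesson on one sliding unit CAM form, by lesson status.
# "just_studied" = 5 is our reading of a partly illegible count (assumption).
ITEMS_PER_LESSON = {"previous": 2, "just_studied": 5, "next": 2}


def items_on_form(n_previous: int, n_just_studied: int, n_next: int) -> int:
    """Total items on a form given the number of lessons in each status."""
    counts = {"previous": n_previous, "just_studied": n_just_studied, "next": n_next}
    return sum(ITEMS_PER_LESSON[status] * n for status, n in counts.items())


# Hypothetical chapter sizes: 5 lessons previously completed,
# 4 lessons just studied, 3 lessons to be studied next.
print(items_on_form(5, 4, 3))  # 2*5 + 5*4 + 2*3
```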

The Dubins Earth Science Test consists of two forms (A and B) each
containing 60 items. Form A was given in September, Form B given in
January. The test is divided into four content areas, Geology,
Astronomy, Meteorology, and Oceanography; and three content
distributions, knowledge, understanding, and application. There are 31
items related to Astronomy; and hj knowledge items, 33 understanding
items, and 50 application items on the test.

The student questionnaire consisted of two 30-item forms. Both
forms were used at the early October and mid-December administrations of
the questionnaire. The students who took Form 1 during October
responded to Form 2 during December and vice-versa. The questionnaire
consisted of statements that the student was asked to agree or disagree
with on a five point scale. Several items were worded negatively to
increase the validity of the instrument. All data were processed so
that a response of 1 indicated the most positive agreement and a response
of 5 indicated the most positive disagreement. All items on the
questionnaire were divided into six categories. These were (1) attitude on
content and activities; (2) attitude on decision making; (3) test
anxiety; (4) course anxiety; (5) self-concept; and (6) use of CAM data.
The questionnaire items were constructed and categorized by members of
the High School District.
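
Processing the data so that 1 always indicates the most positive response implies that negatively worded items were reverse scored. The report does not describe the procedure, so the sketch below shows one common way of doing it on a five point scale; it is an assumption, and the function names and sample responses are hypothetical.

```python
SCALE_MAX = 5  # five point scale, from the text


def recode(response: int, negatively_worded: bool) -> int:
    """Reverse-score negatively worded items so that 1 is always most positive."""
    if not 1 <= response <= SCALE_MAX:
        raise ValueError("response must be between 1 and 5")
    return SCALE_MAX + 1 - response if negatively_worded else response


def category_mean(responses: list) -> float:
    """Mean recoded response for one category.
    `responses` is a list of (response, negatively_worded) pairs."""
    recoded = [recode(r, neg) for r, neg in responses]
    return sum(recoded) / len(recoded)


# Made-up answers to three test-anxiety items, the second worded negatively.
print(category_mean([(2, False), (4, True), (3, False)]))
```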

Operating Procedures 

A major problem encountered in the study was how to convert the
student decisions concerning lesson completion to a computer readable
format. The computer output shown in Figure 4.2 was designed to overcome
this problem. Before the student responded to a test form, he was given
a computerized lesson completion summary sheet. This sheet contains the
date of lesson completion (a blank means that he has not completed the
lesson), his preinstruction and postinstruction scores on the lesson.
He circled those lessons that he had recently completed. The sheet was
sent to keypunching for input into the computer.

While the above technique worked well for the 85 students in the 
study, it would prove quite expensive on a large scale basis. Other 
techniques of entering lesson completion data need to be developed for 
individually paced instructional programs. 
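
The totals printed at the foot of the lesson completion summary (Figure 4.2) can be re-derived from its rows. The sketch below transcribes the rows as we read them from the figure, with None standing for a blank completion date; the names SHEET and totals are hypothetical.

```python
# (lesson, date completed or None, pre-instruction (right, asked),
#  post-instruction (right, asked)), as read from Figure 4.2
SHEET = [
    ("261", "9/22/72",  (0, 2),  (7, 7)),
    ("262", "9/22/72",  (2, 2),  (7, 7)),
    ("263", "9/22/72",  (0, 2),  (6, 7)),
    ("264", None,       (6, 9),  (0, 0)),
    ("265", "10/6/72",  (1, 4),  (4, 7)),
    ("266", "10/6/72",  (2, 5),  (6, 7)),
    ("267", None,       (7, 11), (0, 0)),
    ("271", "10/25/72", (1, 5),  (7, 7)),
    ("272", "10/25/72", (4, 4),  (7, 7)),
    ("273", "10/25/72", (2, 4),  (6, 7)),
    ("274", "10/25/72", (1, 4),  (7, 7)),
    ("275", "10/25/72", (1, 4),  (6, 7)),
    ("291", "11/9/72",  (2, 4),  (4, 7)),
    ("292", "11/9/72",  (3, 4),  (7, 7)),
    ("293", "11/9/72",  (0, 4),  (7, 7)),
    ("294", "11/9/72",  (0, 4),  (6, 7)),
    ("301", "11/29/72", (1, 4),  (5, 5)),
    ("302", "11/29/72", (4, 4),  (5, 5)),
    ("303", "11/29/72", (2, 4),  (4, 5)),
    ("311", None,       (1, 5),  (0, 0)),
    ("312", None,       (2, 4),  (0, 0)),
    ("321", None,       (2, 4),  (0, 0)),
    ("322", None,       (2, 5),  (0, 0)),
]


def totals(column: int) -> tuple:
    """Sum the (right, asked) pairs in the given column (2 = pre, 3 = post)."""
    right = sum(row[column][0] for row in SHEET)
    asked = sum(row[column][1] for row in SHEET)
    return right, asked


completed = [lesson for lesson, date, _, _ in SHEET if date is not None]
print("pre-instruction :", totals(2))
print("post-instruction:", totals(3))
print("lessons completed:", len(completed), "of", len(SHEET))
```

The sums agree with the TOTALS line of the figure, 46/102 preinstruction and 101/113 postinstruction.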

Results

Figures 4.3 and 4.4 contain summarized results of the data collected
in the study. All data (except the CTBS scores) were generated by the
CAM2 computer software run on a Hewlett-Packard 2120 computer system at
the Sequoia Union High School District Central Office. Figure 4.3



FIGURE 4.2

Lesson Completion Summary

KRAMER SCOTT L     105981     WAGNER     PERIOD 2

Wn^m SC202

             DATE        PRE-INS     POST-INS
LESSON     COMPLETED    RESPONSES    RESPONSES

 261        9/22/72        0/ 2        7/ 7
 262        9/22/72        2/ 2        7/ 7
 263        9/22/72        0/ 2        6/ 7
 264                       6/ 9        0/ 0
 265       10/ 6/72        1/ 4        4/ 7
 266       10/ 6/72        2/ 5        6/ 7
 267                       7/11        0/ 0
 271       10/25/72        1/ 5        7/ 7
 272       10/25/72        4/ 4        7/ 7
 273       10/25/72        2/ 4        6/ 7
 274       10/25/72        1/ 4        7/ 7
 275       10/25/72        1/ 4        6/ 7
 291       11/ 9/72        2/ 4        4/ 7
 292       11/ 9/72        3/ 4        7/ 7
 293       11/ 9/72        0/ 4        7/ 7
 294       11/ 9/72        0/ 4        6/ 7
 301       11/29/72        1/ 4        5/ 5
 302       11/29/72        4/ 4        5/ 5
 303       11/29/72        2/ 4        4/ 5
 311                       1/ 5        0/ 0
 312                       2/ 4        0/ 0
 321                       2/ 4        0/ 0
 322                       2/ 5        0/ 0

TOTALS                    46/102     101/113
TOTAL POINTS 178



contains the results displayed for all students in the course and for each
of the three periods, while Figure 4.4 contains the data displayed by
student grade levels, i.e., those students who received an A in the
course, those that received a B, and those that received a C.

Figure 4.3 contains the CTBS reading and mathematics scores of the
students in each of the three periods. These scores are in terms of the
national percentiles for 9th grade students during the month of October.
The results indicate that Periods 1 and 2 entered the course with
approximately the same achievement background in reading and mathematics,
while Period 3 was significantly lower in both these areas. The average
course grade for Period 1 was 3.0 (3.0 = B), Period 2 was 2.6 and Period 3
was 2.^^ (2.0 = C). Moreover, the average student in Period 1 received 4.1
units of credit, the average in Period 2 was 4.0, while the average in
Period 3 was 3.8. Full credit for the semester's work was 5.0.

Based upon the standard CAM test given at the beginning and end of
the course, Period 2 had the highest entry level (32% correct) and the
lowest gain in achievement. Period 1 had the highest gain (57-29=28%),
Period 3 the second highest (51-25=26%), followed by Period 2 (55-32=23%).
Period 2 also had the smallest gain on the astronomy portion of the Dubins
Earth Science test (55-41=14%), compared with Period 1 (59-38=21%) and
Period 3 (53-25=28%). It is interesting to note that Period 3 made the gain
on the Dubins primarily in the understanding and application components of
the test.
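
The gains quoted above are simple post-minus-pre differences in percent
correct. As a minimal sketch of that arithmetic, the by-period values for
the standard CAM test and the Dubins astronomy items can be tabulated and
ranked as shown here. The dictionary layout and the function names `gains`
and `rank_by_gain` are illustrative assumptions and are not part of the
study's own software.

```python
# Pre/post percent-correct values by period, as read from the
# by-period results table (Figure 4.3).
SCORES = {
    "Standard CAM": {
        "Period 1": (29, 57),
        "Period 2": (32, 55),
        "Period 3": (25, 51),
    },
    "Dubins Astronomy": {
        "Period 1": (38, 59),
        "Period 2": (41, 55),
        "Period 3": (25, 53),
    },
}


def gains(instrument):
    """Return {period: post - pre} in percentage points."""
    return {
        period: post - pre
        for period, (pre, post) in SCORES[instrument].items()
    }


def rank_by_gain(instrument):
    """Return the periods ordered from largest to smallest gain."""
    g = gains(instrument)
    return sorted(g, key=g.get, reverse=True)


if __name__ == "__main__":
    for name in SCORES:
        print(name, gains(name), rank_by_gain(name))
```

On both instruments Period 2 ranks last, which is the pattern the text
describes.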

The student questionnaire data show little difference between the
three periods. On the questionnaire scale (1.0 = strongly agree, 5.0 =
strongly disagree), Periods 2 and 3 became less positive toward the course
at the end of the semester (2.5 to 2.6 and 2.5 to 2.8, respectively).
Period 3 students responded slightly negatively (3.2) to the test anxiety
and use of CAM data questions during the second administration of the
questionnaire.

Figure 4.4 contains the results displayed by course grade level.
As would be expected, the CAM results are much higher for the A students
(71% postinstruction) than for the C students (45% postinstruction).
However, the gain in achievement on the astronomy part of the Dubins test
was approximately the same for all three grade levels. This gain seems
to have been made mostly on the understanding and application components
of the Dubins, especially for the C students.

The questionnaire results tend to show that the A and B students
thought more favorably of the course at the end than did the C students.
This difference appears to be spread over all categories of questions.

Discussion of the Results 

There are several problems with interpreting the results of this
study. The first is the sample size. Much of the data presented in
Figures 4.3 and 4.4 are based upon small enough sample sizes to create
doubt about any statistical significance between pairs of values. One
must rather look at trends in the data over several measures. Secondly,
the study was conducted in a real life high school environment. The



FIGURE 4.3
Results Displayed by Course Periods

                                              Student group
Instrument                  All Students    Period 1      Period 2      Period 3

CTBS Reading (1)                 53            57            56            46
CTBS Mathematics (1)             --            50            48            32
Course Grade (2)                 2.7           3.0           2.6           2.4
Units of Credit                  --            4.1           4.0           3.8
Number of Students               85            30            30            25

                             Pre   Post     Pre   Post    Pre   Post    Pre   Post
Standard CAM (3)              29    --       29    57      32    55      25    51
Dubins Total (3)              35    43       37    43      36    41      30    43
Dubins Astronomy (3)          36    56       38    59      41    55      25    53
Dubins Knowledge (3)          35    38       37    42      36    37      30    36
Dubins Understanding (3)      38    47       42    50      44    43      26    50
Dubins Application (3)        33    45       35    --      31    43      32    46
Attitude on Content (4)       2.6   2.6      2.5   2.5     2.6   2.6     2.6   2.7
Attitude on Decisions (4)     --    2.5      --    2.5     2.4   2.3     2.4   2.6
Test Anxiety (4)              3.0   3.0      3.1   2.9     2.9   3.0     2.9   3.2
Course Anxiety (4)            2.2   2.3      2.0   2.2     2.2   2.3     2.2   2.5
Self-concept (4)              2.2   2.5      2.3   2.4     2.2   2.4     2.2   2.5
Use of CAM Data (4)           2.7   3.0      2.7   2.7     2.8   3.0     2.6   3.2
Questionnaire Average (4)     2.5   2.7      2.4   2.4     2.5   2.6     2.5   2.8

Notes: 1: expressed in national percentiles
       2: A=4.0, B=3.0, C=2.0, D=1.0
       3: expressed as the percentage of correct responses
       4: 1.0 = strongly agree, ..., 5.0 = strongly disagree



FIGURE 4.4
Results Displayed by Course Grade Level

                                        Student Group
                               A Students     B Students     C Students
Number of Students                 --             --             --

                               Pre   Post     Pre   Post     Pre   Post
Standard CAM (1)                33    71       31    61       26    45
Dubins Total (1)                --    58       38    46       29    36
Dubins Astronomy (1)            52    72       39    61       29    48
Dubins Knowledge (1)            41    52       37    40       30    32
Dubins Understanding (1)        --    66       --    46       32    41
Dubins Application (1)          --    60       37    49       24    36
Attitude on Content (2)        2.5   2.5      2.4   2.6      2.8   2.8
Attitude on Decisions (2)      2.4   2.3      2.4   2.4      2.4   2.7
Test Anxiety (2)               3.1   2.7      3.2   3.0      2.9   3.2
Course Anxiety (2)             2.3   2.2      2.1   2.2      2.3   2.5
Self-concept (2)               2.0   2.3      2.4   2.3      2.2   2.7
Use of CAM Data (2)            2.3   2.8      2.9   2.9      2.6   3.2
Questionnaire Average (2)      2.3   2.5      2.6   2.6      2.6   2.9

Note: 1: expressed as the percentage of correct responses
      2: 1.0 = strongly agree, ..., 5.0 = strongly disagree






teacher was not always able to maintain the conditions of the study.
Sometimes he was not able to direct the study of each student in Period 1
due to a lack of time, while in Period 2 he was forced to abandon the
study conditions to work with some students who would not have passed the
course if left to their own study decisions. Thirdly, the value of
the student questionnaire as an evaluation instrument is debatable.
Experience with other student questionnaires in the Sequoia High School
District indicates that students tend to be overly agreeable toward the
teacher on these instruments.
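
The sample-size concern raised in this discussion can be made concrete
with a rough sketch. The report gives group sizes (30, 30 and 25 students)
but no per-student spread, so the standard deviation used below is a purely
hypothetical value chosen for illustration, and the function names are
likewise assumptions. The sketch shows only how the standard error of a
difference between two group means behaves at these group sizes; it is not
a reanalysis of the study data.

```python
import math

# Group sizes reported for the three course periods.
N = {"Period 1": 30, "Period 2": 30, "Period 3": 25}

# ASSUMPTION: hypothetical standard deviation of individual gain scores,
# in percentage points. The report does not state this quantity.
ASSUMED_SD = 15.0


def se_of_difference(n_a, n_b, sd=ASSUMED_SD):
    """Standard error of the difference between two independent group means,
    assuming both groups share the same standard deviation."""
    return sd * math.sqrt(1.0 / n_a + 1.0 / n_b)


def z_score(diff, n_a, n_b, sd=ASSUMED_SD):
    """Observed difference expressed in standard-error units."""
    return diff / se_of_difference(n_a, n_b, sd)


if __name__ == "__main__":
    # Period 1 versus Period 2 gain on the standard CAM test: 28 - 23 = 5.
    diff = 28 - 23
    se = se_of_difference(N["Period 1"], N["Period 2"])
    z = z_score(diff, N["Period 1"], N["Period 2"])
    print(f"difference = {diff}, SE = {se:.2f}, z = {z:.2f}")
```

Under the assumed spread, a 5-point difference between two groups of 30
falls well short of two standard errors, which is consistent with the
advice to look at trends across several measures rather than at single
pairs of values.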

Despite these problems, the data seem to strongly indicate that
these earth science students need teacher support when using CRT data
generated by the computer. The teacher felt that the data overwhelmed the
students who were not quantitatively oriented. However, when using the
data with the assistance of a teacher or paraprofessional, the students
seemed to increase their achievement levels (this statement being based
upon other studies done in past years in the same course (13)).

The study suggests a need for additional applied research on the use
and effectiveness of CRT in the classroom. One research effort would be
to replicate the study using different subject matter areas and different
student populations. Another research study would be to analyze the
amount of training a student needs to benefit from computerized CRT data.
Should computerized CRT reports be distributed to third grade students
who have low quantitative abilities? Or should the results be presented
to students by a teacher or paraprofessional? Is the student better able
to utilize the CRT information during his second and third years in a CRT
program?






REFERENCES



 1. Committee for Economic Development, Innovations in Education: New
    Directions for the American School, New York, July, 1968.

 2. Esbensen, Thorwald, Individualizing the Instructional Program,
    Duluth, Minnesota Public Schools, 1966.

 3. Cooley, William, and Glaser, Robert, "An Information and Management
    System for Individually Prescribed Instruction," Working Paper 44,
    University of Pittsburgh, December, 1968.

 4. Enterprise Elementary School District, "Enterprise Technologically
    Managed Individualized Instruction Program," Enterprise,
    California, January, 1970.

 5. Silberman, H.F., "Design Objectives of the Instructional Management
    System," SP-3038/001/00, System Development Corporation, Santa
    Monica, California, 1968.

 6. Gorth, William, "Designing Instructional Systems with Longitudinal
    Testing Using Item Sampling Techniques," a symposium at the
    Annual Meeting of the American Educational Research Association
    in Minneapolis, March, 1970.

 7. Lord, F.M., and Novick, M.R., Statistical Theories of Mental Test
    Scores, Addison-Wesley, Boston, 1968.

 8. Gorth, W.P., Schriber, P.E., and O'Reilly, R.P., Comprehensive
    Achievement Monitoring: Its Design and Use, School of Education,
    University of Massachusetts, Amherst, Massachusetts, 1971.

 9. Pinsky, P.D., "Mathematical Models for Measurement and Control of
    Classroom Achievement," unpublished dissertation, The Department
    of Operations Research, Stanford University, Stanford, Cal., 1971.

10. O'Reilly, R.P., "The Conceptualization of Objectives for Evaluation,"
    a paper presented at the Annual Meeting of the Educational
    Research Association in New Orleans, February, 1973.



11. Bormuth, J.R., On the Theory of Achievement Test Items, The
    University of Chicago Press, 1970.



12. Cronbach, L.J., and Gleser, G.C., Psychological Tests and Personnel
    Decisions, University of Illinois Press, Urbana, 1965, Chapter 4.

13. Easter, J.E., et al., "Comprehensive Achievement Monitoring in the
    Sequoia Union High School District," a symposium presented at
    the California Educational Data Processing Association Meeting,
    Palo Alto, California, December, 1972.






